feat(builder): add ephemeral HTML quick deployments#3208
feat(builder): add ephemeral HTML quick deployments#3208kilo-code-bot[bot] wants to merge 3 commits into
Conversation
|
|
||
| async function cleanupExpiredDeployments(env: Env): Promise<void> { | ||
| const now = Date.now(); | ||
| const list = await env.HTML_DEPLOYMENTS_KV.list({ prefix: KV_KEY_PREFIX }); |
There was a problem hiding this comment.
WARNING: KV list() is not paginated — deployments beyond the first 1000 keys will never be cleaned up
Cloudflare KV's list() returns at most 1000 keys per call and sets list.list_complete = false when there are more. The cleanup loop needs to iterate until list_complete is true, using cursor to fetch subsequent pages:
let cursor: string | undefined;
do {
const list = await env.HTML_DEPLOYMENTS_KV.list({ prefix: KV_KEY_PREFIX, cursor });
for (const key of list.keys) {
// ... existing cleanup logic ...
}
cursor = list.list_complete ? undefined : list.cursor;
} while (cursor);Without this, once the deployment count exceeds 1000 the oldest entries are silently skipped and the Cloudflare workers they reference are never deleted.
| for (const [key, value] of formData.entries()) { | ||
| if (!(value instanceof File)) continue; | ||
|
|
||
| const path = key; |
There was a problem hiding this comment.
WARNING: Multipart form field keys are used directly as deploy paths with no sanitisation — path traversal sequences are accepted
key (the HTML form field name) is assigned directly to path with no validation. A caller can submit a field named ../../evil or __proto__ and it will be passed straight through to validateStaticAssets and ultimately into the deployer. At minimum, paths should be checked for .. components and leading / before use:
if (key.includes('..') || key.startsWith('/')) {
throw new Error(`Invalid file path: "${key}"`);
}| } | ||
|
|
||
| const record: HtmlDeployRecord = { slug, expiresAt }; | ||
| await c.env.HTML_DEPLOYMENTS_KV.put(`${KV_KEY_PREFIX}${slug}`, JSON.stringify(record)); |
There was a problem hiding this comment.
WARNING: KV write after a successful deploy is not error-handled — a failure here orphans the deployed worker permanently
If HTML_DEPLOYMENTS_KV.put() throws (e.g. quota exceeded, transient error), the Cloudflare worker has already been deployed but no cleanup record is created. The cron will never delete it because it has no KV entry. The error should be caught, logged/reported to Sentry, and ideally the newly-deployed worker should be torn down before returning a 500.
| "kv_namespaces": [ | ||
| { | ||
| "binding": "HTML_DEPLOYMENTS_KV", | ||
| "id": "TODO: create with `wrangler kv namespace create HTML_DEPLOYMENTS_KV` and fill in", |
There was a problem hiding this comment.
CRITICAL: Placeholder string "TODO: create with ..." is not a valid KV namespace ID
Deploying with this config will fail at the Wrangler validation step (wrangler deploy). The KV namespace must be created and its real UUID filled in before this can be deployed to production.
| "id": "TODO: create with `wrangler kv namespace create HTML_DEPLOYMENTS_KV` and fill in", | |
| "id": "REPLACE_ME_WITH_REAL_KV_NAMESPACE_ID", |
Code Review SummaryStatus: 1 Issue Found | Recommendation: Address before merge Executive SummaryThe Overview
✅ Fixed in latest commits
Issue Details (click to expand)CRITICAL
Files Reviewed (16 files)
Fix these issues in Kilo Cloud Reviewed by claude-4.6-sonnet-20260217 · 2,758,105 tokens Review guidance: REVIEW.md from base branch |
Review responseCRITICAL: KV namespace TODO placeholder — This requires running WARNING: KV write failure — Fixed in 2d9874e. If Already fixed in earlier pushes:
|
| constructor(ctx: DurableObjectState, env: Env) { | ||
| super(ctx, env); | ||
| this.db = drizzle(ctx.storage, { logger: false }); | ||
| void ctx.blockConcurrencyWhile(() => migrate(this.db, migrations)); |
There was a problem hiding this comment.
CRITICAL: void ctx.blockConcurrencyWhile(...) discards the Promise — migrations may not complete before the first RPC arrives
Using void here means the blockConcurrencyWhile call is fire-and-forget. Cloudflare's DO documentation requires await so that the runtime holds off all incoming requests until the callback resolves. Without await, a concurrent register() or alarm() call could execute against an uninitialized (pre-migration) database and either fail or silently corrupt state.
| void ctx.blockConcurrencyWhile(() => migrate(this.db, migrations)); | |
| await ctx.blockConcurrencyWhile(() => migrate(this.db, migrations)); |
25de5b1 to
77f46a4
Compare
Summary
Why
Generated static sites need a lightweight publishing path that does not create a repository-backed deployment or start the sandbox build pipeline. This adds an authenticated builder boundary for raw HTML and multipart static bundles while keeping public aliases separate from private Worker identifiers and making ephemeral cleanup durable across partial external failures.
What was done
POST /deploy-htmlsupport for raw HTML and multipart static bundles with a 10 MB body limit, static-path validation, a per-user Cloudflare rate limit, and friendlyadjective-noun-xxxxxxxxaliases. This PR exposes infrastructure capability only; it does not add a Kilo app UI or in-repository caller.deployments_ephemeral: insert a pending row before provisioning, activate it after alias mapping, and use claim-safe five-minute cleanup sweeps for expired or failed resources.qdpl-<uuid>Worker name, reject direct private-host routing, and map the friendly alias through the dispatcher KV API with guarded forward/reverse compensation.@kilocode/worker-utils/deployment-slugwithout changing persistent deployment URL formats.High-level architecture
sequenceDiagram actor Client as Quick-deploy client participant Builder as Builder Worker participant DB as Hyperdrive / Postgres participant CF as Cloudflare API + private qdpl Worker participant Dispatcher as Dispatcher API / public router participant KV as DEPLOY_KV Client->>Builder: POST /deploy-html with bearer token and static assets Builder->>Builder: Authenticate, rate-limit, validate upload and TTL Builder->>DB: Lock retained owner and insert pending lifecycle row Builder->>CF: Deploy assets under private qdpl UUID Builder->>DB: Check stored and ephemeral slug conflicts Builder->>Dispatcher: PUT /api/slug-mapping/{worker} Dispatcher->>KV: Update reciprocal alias keys with compensation Builder->>Dispatcher: PUT /api/app-builder-banner/{worker} (best effort) Dispatcher->>KV: Enable banner key Builder->>DB: Activate row with alias and expiry Builder-->>Client: Return friendly URL and expiry Client->>Dispatcher: GET friendly public URL Dispatcher->>KV: Resolve slug2worker mapping KV-->>Dispatcher: Return private qdpl UUID Dispatcher->>CF: Dispatch request to private Worker CF-->>Dispatcher: Return static response Dispatcher-->>Client: Return response, injecting banner when enabled opt Five-minute scheduled cleanup Builder->>DB: Claim bounded due rows with SKIP LOCKED par Independent teardown attempts Builder->>Dispatcher: DELETE alias mapping and Builder->>Dispatcher: DELETE banner key and Builder->>CF: Delete private Worker and assets end alt All teardown targets succeed Builder->>DB: Delete claimed lifecycle row else Any teardown target fails Builder->>DB: Clear claim and retry after five minutes end endArchitecture decision
Decision: Store ephemeral deployment lifecycle state in Postgres while keeping dispatcher alias routing KV-only.
Context: Quick deployments need ownership-aware cleanup, GDPR handling, retry scheduling, and recoverable claims without holding database locks across Cloudflare or dispatcher I/O. Public requests still need the existing low-latency dispatcher routing model.
Rationale: A pending Postgres row is written before external provisioning, then transitioned with compare-and-swap semantics after mapping succeeds. Scheduled workers claim bounded due rows with
FOR UPDATE SKIP LOCKED, tear down alias, banner, and Worker resources independently, and retain retry state after partial failure. Direct KV alias updates stay narrow and use guarded compensation for reciprocal keys.Consequences: Cleanup work is observable and retryable in Postgres, but alias routing remains KV-backed and eventually consistent. Alias mutations are not globally serialized, so the guarded remap and deletion compensation paths deserve focused review.
Verification
Visual Changes
Reviewer Notes
0153_nosy_psynapse, while currentmainalready contains migrations0153and0154. Regenerate migration artifacts after rebasing; do not hand-edit them.BACKEND_AUTH_TOKEN, builderDISPATCHER_AUTH_TOKEN, and webUSER_DEPLOYMENTS_DISPATCHER_AUTH_KEYto the same dispatcher management API credential./api/slug-mapping/:workerseam.